Best Arm Identification for Contaminated Bandits

Authors

Jason Altschuler

Victor-Emmanuel Brunel

Alan Malek

Abstract

This paper studies active learning in the context of robust statistics. Specifically, we propose the Contaminated Best Arm Identification variant of the multi-armed bandit problem, in which every arm pull has probability ε of generating a sample from an arbitrary contamination distribution instead of the true underlying distribution. The goal is to identify the best (or approximately best) true distribution with high probability, with a secondary goal of providing guarantees on the quality of that arm’s underlying distribution. It is simple to see that in this contamination model there are no consistent estimators for statistics (e.g. median) of the underlying distribution, and that even with infinite samples, statistics can be estimated only up to some unavoidable bias. We present tight, non-asymptotic sample complexity bounds for estimating the first two robust moments (median and median absolute deviation) with high probability. We then show how to use this algorithmically for our problem by adapting Best Arm Identification algorithms from the classical multi-armed bandit literature. We give matching upper and lower bounds (up to a small logarithmic factor) on these algorithms’ sample complexities. These results suggest an inherent robustness of classical Best Arm Identification algorithms.
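The contamination model and the robust statistics named in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's algorithm: the Gaussian arms, the fixed contamination values, the uniform sampling budget, and the names `pull`, `median_and_mad`, and `run` are all hypothetical choices made for illustration.

```python
import random
import statistics


def pull(true_median, eps, contamination, rng):
    """One arm pull: with probability eps return the contamination value,
    otherwise a unit-variance Gaussian sample centered at true_median.
    (Gaussian arms and a point-mass contamination are assumptions.)"""
    if rng.random() < eps:
        return contamination
    return rng.gauss(true_median, 1.0)


def median_and_mad(samples):
    """Empirical median and median absolute deviation of a sample list."""
    med = statistics.median(samples)
    mad = statistics.median([abs(x - med) for x in samples])
    return med, mad


def run(seed=0, eps=0.1, n=2000):
    """Pull every arm n times (uniform sampling, not an adaptive
    Best Arm Identification algorithm) and compare median vs. mean."""
    rng = random.Random(seed)
    true_medians = [0.0, 0.5, 2.0]          # arm 2 is the best true arm
    contaminations = [100.0, 100.0, -100.0]  # chosen to favor the wrong arms
    samples = [
        [pull(m, eps, c, rng) for _ in range(n)]
        for m, c in zip(true_medians, contaminations)
    ]
    stats = [median_and_mad(s) for s in samples]
    medians = [med for med, _ in stats]
    mads = [mad for _, mad in stats]
    means = [statistics.mean(s) for s in samples]
    best_by_median = max(range(len(medians)), key=medians.__getitem__)
    best_by_mean = max(range(len(means)), key=means.__getitem__)
    return medians, mads, means, best_by_median, best_by_mean


if __name__ == "__main__":
    medians, mads, means, by_median, by_mean = run()
    print("medians:", medians)
    print("MADs:", mads)
    print("means:", means)
    print("best arm by median:", by_median, "| by mean:", by_mean)
```

In this setup the empirical medians stay close to the true medians, off by a bias that does not vanish with more samples, while the empirical means are dominated by the contamination values and rank the arms incorrectly.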

Similar resources

PAC Bandits with Risk Constraints

We study the problem of best arm identification with risk constraints within the setting of fixed confidence pure exploration bandits (PAC bandits). The goal is to stop as fast as possible, and with high confidence return an arm whose mean is ε-close to the best arm among those that satisfy a risk constraint, namely their α-quantile functions are larger than a threshold β. For this risk-sensitiv...

Practical Algorithms for Best-K Identification in Multi-Armed Bandits

In the Best-K identification problem (Best-K-Arm), we are given N stochastic bandit arms with unknown reward distributions. Our goal is to identify the K arms with the largest means with high confidence, by drawing samples from the arms adaptively. This problem is motivated by various practical applications and has attracted considerable attention in the past decade. In this paper, we propose n...

Best-Arm Identification in Linear Bandits

We study the best-arm identification problem in linear bandit, where the rewards of the arms depend linearly on an unknown parameter θ and the objective is to return the arm with the largest reward. We characterize the complexity of the problem and introduce sample allocation strategies that pull arms to identify the best arm with a fixed confidence, while minimizing the sample budget. In parti...

One Practical Algorithm for Both Stochastic and Adversarial Bandits

We present an algorithm for multiarmed bandits that achieves almost optimal performance in both stochastic and adversarial regimes without prior knowledge about the nature of the environment. Our algorithm is based on augmentation of the EXP3 algorithm with a new control lever in the form of exploration parameters that are tailored individually for each arm. The algorithm simultaneously applies...

Best arm identification in multi-armed bandits with delayed feedback

We propose a generalization of the best arm identification problem in stochastic multiarmed bandits (MAB) to the setting where every pull of an arm is associated with delayed feedback. The delay in feedback increases the effective sample complexity of standard algorithms, but can be offset if we have access to partial feedback received before a pull is completed. We propose a general framework ...

Publication date: 2018
